Learning Rate Adaptation in Stochastic Gradient Descent
Abstract
The efficient supervised training of artificial neural networks is commonly viewed as the minimization of an error function that depends on the weights of the network. This perspective aids the development of effective training algorithms, because the problem of minimizing a function is well known in the field of numerical analysis. Typically, deterministic minimization methods are employed; however, in several cases, significant training speed-ups and alleviation of the local minima problem can be achieved when stochastic minimization methods are used. In this paper, a method for adapting the learning rate in stochastic gradient descent is presented. The main feature of the proposed learning rate adaptation scheme is that it exploits gradient-related information from the current as well as the two previous pattern presentations. This seems to provide some stabilization in the value of the learning rate and helps stochastic gradient descent to exhibit fast convergence and a high rate of success. Tests on various problems validate the above-mentioned characteristics of the new algorithm.
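To make the flavor of such a scheme concrete, the following is a minimal sketch, assuming a multiplicative increase/decrease rule driven by the inner products of the stochastic gradients from the current and the two previous pattern presentations. The function names (`sgd_with_adaptive_lr`, `grad_fn`) and the update constants are illustrative assumptions, not the paper's exact formula.

```python
import numpy as np

def sgd_with_adaptive_lr(grad_fn, w, patterns, lr=0.1,
                         inc=1.05, dec=0.7, lr_min=1e-6, lr_max=1.0):
    """Per-pattern SGD whose learning rate is adapted from the gradients
    of the current and the two previous pattern presentations.
    Illustrative sketch, not the paper's exact rule."""
    g_prev, g_prev2 = None, None
    for x in patterns:
        g = grad_fn(w, x)                      # stochastic gradient for this pattern
        if g_prev is not None and g_prev2 is not None:
            s1 = np.dot(g, g_prev)             # agreement with previous gradient
            s2 = np.dot(g_prev, g_prev2)       # agreement one step further back
            if s1 > 0 and s2 > 0:
                lr = min(lr * inc, lr_max)     # consistent direction: grow the rate
            elif s1 < 0 and s2 < 0:
                lr = max(lr * dec, lr_min)     # persistent oscillation: shrink it
            # mixed signs: leave the rate unchanged (the stabilizing case)
        w = w - lr * g                         # standard SGD step
        g_prev2, g_prev = g_prev, g
    return w, lr
```

Using three presentations rather than two means a single noisy gradient cannot flip the rate by itself, which is one way such a rule can stabilize the learning rate under stochastic updates.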
Similar resources
Chapter 2 LEARNING RATE ADAPTATION IN STOCHASTIC GRADIENT DESCENT
Coupling Adaptive Batch Sizes with Learning Rates
Mini-batch stochastic gradient descent and variants thereof have become standard for large-scale empirical risk minimization like the training of neural networks. These methods are usually used with a constant batch size chosen by simple empirical inspection. The batch size significantly influences the behavior of the stochastic optimization algorithm, though, since it determines the variance o...
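The variance argument above suggests what such a coupling could look like in code. The sketch below is an illustrative heuristic assuming access to per-example gradients; the name `suggest_batch_size` and the constants are hypothetical, and this is not the exact rule derived in the paper.

```python
import numpy as np

def suggest_batch_size(per_example_grads, lr, b_min=16, b_max=4096, c=1.0):
    """Illustrative heuristic: pick a batch size so that mini-batch
    gradient noise stays in proportion to the step size."""
    g_bar = per_example_grads.mean(axis=0)        # mini-batch gradient estimate
    var = per_example_grads.var(axis=0).sum()     # total per-example gradient variance
    signal = np.dot(g_bar, g_bar)                 # squared gradient norm
    # Larger learning rates and noisier gradients call for larger batches.
    b = c * lr * var / max(signal, 1e-12)
    return int(np.clip(b, b_min, b_max))
```

The intuition is that a large step taken on a noisy gradient estimate is wasted work, so the batch grows with the learning rate and the estimated gradient variance.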
Microscale Adaptive Optics: Wave-Front Control with a μ-Mirror Array and a VLSI Stochastic Gradient Descent Controller.
The performance of adaptive systems that consist of microscale on-chip elements [microelectromechanical mirror (μ-mirror) arrays and a VLSI stochastic gradient descent microelectronic control system] is analyzed. The μ-mirror arrays with 5 × 5 and 6 × 6 actuators were driven with a control system composed of two mixed-mode VLSI chips implementing model-free beam-quality metric optimization by...
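Model-free metric optimization of this kind is commonly realized as stochastic parallel gradient descent (SPGD): perturb all actuators at once, measure the change in the quality metric, and correlate the two. The sketch below shows that generic scheme, not the cited VLSI controller; the names and gains are illustrative.

```python
import numpy as np

def spgd_step(u, metric, gain=0.5, delta=0.05, rng=np.random.default_rng()):
    """One model-free stochastic parallel gradient descent step on a
    measured quality metric J(u), where u holds the actuator commands.
    Generic SPGD sketch, not the cited controller."""
    du = delta * rng.choice([-1.0, 1.0], size=u.shape)  # random +/- perturbation on all actuators
    dJ = metric(u + du) - metric(u - du)                # two-sided metric measurement
    return u + gain * dJ * du                           # correlate metric change with perturbation

# Usage on a toy 6 x 6 actuator array with a synthetic metric to maximize:
u = np.zeros((6, 6))
target = np.ones((6, 6))
metric = lambda v: -np.sum((v - target) ** 2)           # beam-quality surrogate (higher is better)
for _ in range(500):
    u = spgd_step(u, metric)
```

Because only scalar metric readings are needed, no model of the optical system is required, which is what makes the scheme attractive for on-chip control.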
Identification of Multiple Input-multiple Output Non-linear System Cement Rotary Kiln using Stochastic Gradient-based Rough-neural Network
Because of the existing interactions among the variables of a multiple input-multiple output (MIMO) nonlinear system, its identification is a difficult task, particularly in the presence of uncertainties. Cement rotary kiln (CRK) is a MIMO nonlinear system in the cement factory with a complicated mechanism and uncertain disturbances. The identification of CRK is very important for different pur...
Online Learning Rate Adaptation with Hypergradient Descent
We introduce a general method for improving the convergence rate of gradient-based optimizers that is easy to implement and works well in practice. We demonstrate the effectiveness of the method in a range of optimization problems by applying it to stochastic gradient descent, stochastic gradient descent with Nesterov momentum, and Adam, showing that it significantly reduces the need for the man...
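For plain SGD, the hypergradient update takes a particularly simple form: the learning rate is itself adjusted by gradient descent on the loss, using the inner product of the current and previous stochastic gradients. A minimal sketch under that reading follows; the hyper-learning rate `beta` and the function names are illustrative.

```python
import numpy as np

def sgd_hd(grad_fn, w, steps, alpha=0.01, beta=1e-4):
    """SGD with hypergradient descent on the learning rate alpha:
    alpha is nudged by the inner product of successive stochastic
    gradients before each weight update."""
    g_prev = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        # The hypergradient of the loss w.r.t. alpha is -(g . g_prev),
        # so descending it adds beta * (g . g_prev) to alpha.
        alpha += beta * np.dot(g, g_prev)
        w = w - alpha * g
        g_prev = g
    return w, alpha
```

When successive gradients point the same way, alpha grows; when they oppose, it shrinks, so a rough initial alpha is corrected online instead of being tuned by hand.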